Skip to content

Upgrade Struct2Tensor to TensorFlow 2.21.0#59

Open
vkarampudi wants to merge 32 commits intogoogle:masterfrom
vkarampudi:testing1
Open

Upgrade Struct2Tensor to TensorFlow 2.21.0#59
vkarampudi wants to merge 32 commits intogoogle:masterfrom
vkarampudi:testing1

Conversation

@vkarampudi
Copy link
Copy Markdown
Collaborator

@vkarampudi vkarampudi commented Apr 10, 2026

This PR upgrades struct2tensor to support TensorFlow 2.21.0 and resolves various build system incompatibilities, dependency conflicts, and compilation errors encountered during the process.

Key Changes

1. TensorFlow 2.21.0 Dependency Resolution Conflicts

  • Updated .bazelversion to 7.7.0 to align with TensorFlow 2.21.0 requirements.
  • Updated WORKSPACE to fetch TensorFlow 2.21.0
  • Problem: The build failed with numerous errors regarding missing or conflicting external repositories (e.g., rules_java, local_config_cuda, local_config_tensorrt, llvm-raw, etc.).
  • Reason: Upgrading to TensorFlow 2.21.0 introduced new transitive dependencies and assumed the presence of toolchains (like Java and GPU-specific configurations) not directly needed by struct2tensor.
  • Fix: We created multiple dummy repositories in the WORKSPACE file to satisfy these unresolved references without adding unnecessary overhead. We also applied custom patches to TensorFlow’s source files to remove Java, CUDA, ROCm, and TensorRT dependencies that were causing failures.

2. Python 3.12, 3.13 Support and Python 3.9 Drop

  • Problem: The codebase needed to be updated to support Python 3.12, 3.13 and drop support for the deprecated Python 3.9, aligning with newer TensorFlow requirements.
  • Reason: Evolving ecosystem standards and dependency compatibility demanded a Python version upgrade.
  • Fix: We modified setup.py and environment.yml to declare support for Python 3.12, 3.13 dropped Python 3.9, and pinned compatible versions for critical libraries like pyarrow (>=14), h5py, and ml-dtypes.

3. Protobuf Version Mismatch (Compiler vs. Headers)

  • Problem: Build failed with fatal errors stating that generated Protobuf files (e.g., field_mask.pb.h) were generated by a newer version of protoc than the Protocol Buffer headers being included in compilation.
  • Reason: Bazel 7’s default Bzlmod behavior ignored the manual overrides specified in WORKSPACE, causing it to fetch mismatched Protobuf compilers and headers.
  • Fix: We explicitly disabled Bzlmod in .bazelrc using common --noenable_bzlmod and moved the register_toolchains call to the very top of the WORKSPACE file to enforce correct resolution priority. We also applied a specialized patch (protobuf_tensorflow.patch) to fix macro and namespace issues.

4. Test Sharding Validation Errors

  • Problem: Running benchmark tests threw errors stating that "Sharding was requested, but the test runner did not advertise support".
  • Reason: Bazel 7 enforces stricter checks on test sharding protocols by default.
  • Fix: Added test --noincompatible_check_sharding_support to .bazelrc to allow the tests to skip this check and execute successfully.

5. CI Assembler Error (--gsframe)

  • Problem: The CI build failed with an error from the assembler: as: option '--gsframe' doesn't allow an argument.
  • Reason: The flag --gsframe=no was added to fixed local build issues on GCC 15, but the CI runner uses an older assembler that does not support this flag.
  • Fix: We moved the flag to a specific --config=sframe_fix in .bazelrc. This allows default builds (including CI) to run without it, while local users with GCC 15 can use bazel build --config=sframe_fix to avoid linker errors.

6. Added CI Pre-Commit workflow

  • Problem: we don't have a workflow to check the lint Issues
  • Fix: We Implemented new pre-commit workflow that check lint Issue on each PR workflow run to main branch

7. Simplified Dependencies in setup.py

  • Problem: Conditional dependencies for protobuf based on Python version were present in setup.py.
  • Reason: Historical compatibility constraints that are no longer strictly necessary for the targeted versions.
  • Fix: Simplified the protobuf dependency to a single constraint protobuf>=4.25.2,<7.0.0 which satisfies Python 3.10, 3.11, and 3.12.

8. Updated CI Workflow for Multi-Version Testing

  • Problem: The CI workflow failed to resolve Python version conflicts and specific dependency versions (Keras) when using a matrix.
  • Reason: environment.yml had a fixed Python version (3.12) and fixed Keras version (3.13.2) which conflicted with matrix versions or Python 3.10 compatibility.
  • Fix: Added a python-version matrix ["3.10", "3.11", "3.12"] to build.yml. Updated environment.yml
    to use a Python version range (>=3.10,<3.13) and relaxed Keras constraint to keras>=3.0.0

Verification

All tests passed successfully locally:

bazel test //... --noincompatible_check_sharding_support ...

@vkarampudi vkarampudi changed the title Testing1 Upgrade Struct2Tensor to TensorFlow 2.21.0 Apr 11, 2026
Comment thread environment.yml
- h5py==3.14.0
- idna==3.11
- keras==3.13.2
- keras>=3.0.0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this lowered? The vulnerabilities ask for this to be 3.12.1+.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keras is lowered because 3.13.2 dosn't support python 3.10 version, since we have python 3.10 workflows, Instead of removing it, I loosen the dependency to satisfy the requirement.

Comment thread setup.py Outdated
'protobuf>=4.25.2,<6.0.0;python_version>="3.11"',
'protobuf>=4.21.6,<6.0.0;python_version<"3.11"',
'tensorflow>=2.17,<2.18',
'protobuf>=4.25.2,<7.0.0',
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In environment.yml you set protobuf to 6.31. Why isn't this set to that level?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the protobuf version range to >=6.0.0,<7.0.0 in setup.py. Since 6.31.1 falls within this range, it should satisfy the requirement. As we generally avoid pinning exact versions in setup.py, I believe maintaining version ranges is better. If we pinned an exact version, anyone building S2T who requires a different protobuf version than 6.31.1 might encounter issues.

Comment thread environment.yml
- python>=3.10,<3.13

- pip
- pip:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this file include explicit specifications of component versions needed to address vulnerabilities? I don't see components like cryptography, jaraco-context, ipython, pyopenssl.

Also, I see some that have values that don't meet the "fix" level for the vulnerabilities

  • requests needs to be 2.33 (not 2.32.5)
  • pygments needs to be at 2.20 (not 2.19.2)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also updated the below dependencies to fix vulnerabilities, meanwhile these are transitive dependency's [cryptography, jaraco-context, ipython, pyopenssl.] that might be coming from diff components, I think we can't include them here..!

  • requests to 2.33
  • pygments to 2.20

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted.

Copy link
Copy Markdown

@rwitcher rwitcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few clarifications

@vkarampudi vkarampudi requested review from rwitcher April 14, 2026 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants